Integrated Analysis of Whole-Genome Paired-End and Mate-Pair Sequencing Data for Identifying Genomic Structural Variations in Multiple Myeloma
Abstract
We present a pipeline for the integrative analysis of mate-pair (MP) and paired-end (PE) genomic DNA sequencing data. The pipeline detects structural variations (SVs) by taking aligned sequencing read pairs as input and classifying them as properly paired or discordantly paired based on their orientation and inferred insert size. Recurrent SVs are then identified from the discordant read pairs. To increase detection specificity, the pipeline takes genomic annotation and repetitive-element information into account. Applied to whole-genome MP and PE sequencing data from three multiple myeloma cell lines (KMS11, MM.1S, and RPMI8226), the pipeline recovered known SVs, such as a heterozygous TRAF3 deletion, as well as a novel, experimentally validated SPI1-ZNF287 inter-chromosomal rearrangement in the RPMI8226 cell line.
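The classification step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `ReadPair` record, the `classify_pair` function, the category names, and the insert-size bounds are all hypothetical. The sketch assumes that PE libraries are expected in forward-reverse (FR, inward-facing) orientation and MP libraries in reverse-forward (RF, outward-facing) orientation, and it approximates the insert size by the distance between leftmost alignment positions, ignoring read length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    """Hypothetical minimal record for one aligned read pair."""
    chrom1: str
    pos1: int      # leftmost aligned position of read 1
    strand1: str   # "+" or "-"
    chrom2: str
    pos2: int      # leftmost aligned position of read 2
    strand2: str   # "+" or "-"


def classify_pair(pair, min_insert, max_insert, expected="FR"):
    """Label a pair as 'proper' or with a discordance category.

    expected: "FR" (assumed for paired-end) or "RF" (assumed for mate-pair).
    min_insert / max_insert: illustrative bounds on the inferred insert size.
    """
    # Mates on different chromosomes cannot be properly paired.
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"

    # Order the two mates by genomic position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )

    # Both mates on the same strand suggests an inversion-like event.
    if left_strand == right_strand:
        return "same_strand"

    # Opposite strands: inward-facing is FR, outward-facing is RF.
    orientation = "FR" if left_strand == "+" else "RF"
    if orientation != expected:
        return "unexpected_orientation"

    # Simplified inferred insert size (read length is ignored here).
    span = right_pos - left_pos
    if span > max_insert:
        return "insert_too_large"   # consistent with a deletion
    if span < min_insert:
        return "insert_too_small"   # consistent with an insertion
    return "proper"


if __name__ == "__main__":
    pe_pair = ReadPair("chr1", 1000, "+", "chr1", 1300, "-")
    print(classify_pair(pe_pair, min_insert=200, max_insert=500))
```

In a real pipeline the bounds would be estimated from each library's empirical insert-size distribution, and the discordant pairs would then be clustered to call recurrent SVs; both steps are outside the scope of this sketch.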
Related articles
SVDetect: a tool to identify genomic structural variations from paired-end and mate-pair sequencing data
SUMMARY We present SVDetect, a program designed to identify genomic structural variations from paired-end and mate-pair next-generation sequencing data produced by the Illumina GA and ABI SOLiD platforms. Applying both sliding-window and clustering strategies, we use anomalously mapped read pairs provided by current short read aligners to localize genomic rearrangements and classify them accord...
ViVar: A Comprehensive Platform for the Analysis and Visualization of Structural Genomic Variation
Structural genomic variations play an important role in human disease and phenotypic diversity. With the rise of high-throughput sequencing tools, mate-pair/paired-end/single-read sequencing has become an important technique for the detection and exploration of structural variation. Several analysis tools exist to handle different parts and aspects of such sequencing based structural variation ...
Finding Structural Variants in Short Read, Paired-end Sequence Data with R and Bioconductor
Second-generation sequencing technologies, when used to sequence genomic DNA with paired-end reads, are extremely powerful and sensitive tools for discovering structural variation within a genome. Insertions and deletions as well as translocations (and inversions–not discussed directly here), if present in the genomic DNA under investigation, may be captured by examining paired-end reads that a...
DELLY: structural variant discovery by integrated paired-end and split-read analysis
MOTIVATION The discovery of genomic structural variants (SVs) at high sensitivity and specificity is an essential requirement for characterizing naturally occurring variation and for understanding pathological somatic rearrangements in personal genome sequencing data. Of particular interest are integrated methods that accurately identify simple and complex rearrangements in heterogeneous sequen...
De novo assembly and genomic structural variation analysis with genome sequencer FLX 3K long-tag paired end reads.
The Genome Sequencer FLX System from Roche and 454 Life Sciences™ is a versatile sequencing platform suitable for a wide range of applications, including de novo sequencing and assembly of genomic DNA, transcriptome sequencing, metagenomics analysis, and amplicon sequencing. The Genome Sequencer FLX enables long sequence reads separated by kilobase distances of genomic DNA. These Long-Tag Pair...
Volume: 13
Publication date: 2014